Abstract

Unbiased location- and scale-invariant 'elemental' estimators for the GPD tail parameter are constructed. Each involves three log-spacings. The estimators are unbiased for finite sample sizes, even as small as $N = 3$. It is shown that the elementals form a complete basis for unbiased location- and scale-invariant estimators constructed from linear combinations of log-spacings. Preliminary numerical evidence is presented which suggests that elemental combinations can be constructed which are consistent estimators of the tail parameter for samples drawn from the pure GPD family.
1 Introduction

The Generalized Pareto Distribution (GPD) and the Generalized Extreme Value (GEV) distribution play a central role in extreme value theory. Each has three parameters $(\mu, \sigma, \xi)$ corresponding to location, scale and tail (or shape) respectively. This paper describes a particularly simple set of location- and scale-invariant 'elemental' estimators for the GPD tail parameter. Each 'elemental' involves three log-spacings of the data, and each is unbiased over all tail parameters $-\infty < \xi < \infty$, and for all sample sizes, as small as $N = 3$.
The elemental estimators (illustrated in Figure 1) have the form

\[
\hat{\xi}_{IJ} = (J-1)\log(X_I - X_{J-1}) - (J-I-1)\log(X_I - X_J) - I\log(X_{I+1} - X_J), \qquad J \ge I + 2, \tag{1}
\]

where the $X_j$ are the upper-order statistics, numbered in decreasing order starting from $j = 1$ as the data maximum.
Figure 1: Between any two non-adjacent data points $X_I$ and $X_J$ an elemental estimator $\hat{\xi}_{IJ}$ can be defined. It involves three log-spacings: the one between $X_I$ and $X_J$, together with two shorter log-spacings connecting each end-point to the data point immediately inside the other end.
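As a hedged illustration of the three log-spacings just described, the sketch below evaluates a single elemental using the weights $(J-1,\ -(J-I-1),\ -I)$ that Section 4 assigns to the left, upper-right and lower log-spacings. The function name `elemental` and its 1-based index arguments are choices made for this example only.

```python
import math

def elemental(x, i, j):
    """Elemental estimate for 1-based indices i, j with j >= i + 2.

    x holds the data sorted in decreasing order, so x[0] is X_1 (the maximum).
    """
    if not (1 <= i and i + 2 <= j <= len(x)):
        raise ValueError("need 1 <= i and i + 2 <= j <= len(x)")
    left = math.log(x[i - 1] - x[j - 2])    # log(X_I - X_{J-1})
    full = math.log(x[i - 1] - x[j - 1])    # log(X_I - X_J)
    lower = math.log(x[i] - x[j - 1])       # log(X_{I+1} - X_J)
    return (j - 1) * left - (j - i - 1) * full - i * lower
```

For the three points 10, 7, 1 the only elemental is `elemental([10.0, 7.0, 1.0], 1, 3)`, which equals $2\log 3 - \log 9 - \log 6 = -\log 6$. Shifting or rescaling the data leaves the value unchanged, because the three weights sum to zero and only differences of data values enter.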



The Generalized Pareto Distribution arises as the limiting distribution of maxima in Peaks-Over-Threshold approaches (see for example Embrechts et al. (1999)). It has distribution function:

\[
F(x) = 1 - \left(1 + \frac{\xi (x - \mu)}{\sigma}\right)^{-1/\xi} \tag{2}
\]

The parameters $\mu$ and $\xi$ can take any value on the real line, whilst $\sigma$ can be any positive value (and when $\xi = 0$ the distribution function becomes the exponential distribution). For GPDs with positive $\xi$, the support ($\mu < x$) is unbounded at the right, giving long- or heavy-tailed distributions. For $\xi$ negative, the support is bounded both below and above ($\mu < x < \mu - \sigma/\xi$).
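A minimal sketch of how samples from this distribution might be generated, assuming the standard inverse-transform approach: inverting the distribution function gives $x = \mu + (\sigma/\xi)(G^{-\xi} - 1)$ for an exceedance probability $G = 1 - F$, with the exponential limit $x = \mu - \sigma \log G$ at $\xi = 0$. The names `gpd_quantile` and `gpd_sample` are hypothetical helpers, not part of the paper.

```python
import math
import random

def gpd_quantile(g, xi, mu=0.0, sigma=1.0):
    """Map an exceedance probability g in (0, 1] to a GPD data value."""
    if xi == 0.0:
        return mu - sigma * math.log(g)          # exponential limit
    return mu + sigma * (g ** (-xi) - 1.0) / xi

def gpd_sample(n, xi, mu=0.0, sigma=1.0, rng=random):
    """Draw n GPD values, sorted in decreasing order (X_1 is the maximum)."""
    # 1 - rng.random() lies in (0, 1], which keeps log(g) and g**(-xi) finite.
    draws = [gpd_quantile(1.0 - rng.random(), xi, mu, sigma) for _ in range(n)]
    return sorted(draws, reverse=True)
```

With $\xi = -0.5$, $\mu = 2$ and $\sigma = 3$, every draw falls between $\mu = 2$ and the upper end-point $\mu - \sigma/\xi = 8$, in line with the bounded support described above.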



2 Other estimators 

Estimators for the tail parameter can be loosely classed into: maximum likelihood (ML); method of moments; Pickands-like and Bayesian. Standard texts such as Embrechts et al. (1999) and Reiss and Thomas (2001) provide detailed background, with Coles (2001) giving the Bayesian perspective. de Zea Bermudez and Kotz (2010) provide a comprehensive review, such that only a brief survey is presented here.

The maximum likelihood approach to the GPD is described in Smith (1987). Although it possesses some desirable properties, the numerical maximization algorithms can experience problems for small sample sizes and for negative tail parameters, as there are occasions when the likelihood function does not possess a local maximum (Castillo and Daoudi (2009)). To avoid such problems, a method of moments approach was proposed by Hosking and Wallis (1987).

The classical tail parameter estimator is that of Hill (1975). However, it is not location invariant and is only valid in the heavy-tailed Frechet region ($\xi$ positive) of the GEV, although an extension into the Weibull region ($\xi$ negative) was proposed by Dekkers et al. (1989) using the method of moments. Pickands (1975) proposed an estimator based on log-spacings which overcame many of these shortcomings. This estimator is popular in current applications, and a substantial literature exists on its generalization (Drees (1998); Yun (2002), for example), the most general and efficient of which appear to be those of Segers (2005), derived using second order theory of regular variation. These are optimised for estimation of the tail index in the more general case of data drawn from any distribution within the domain of attraction of the particular GPD. Although the main concern of Extreme Value Theory is the domain of attraction case, this paper restricts attention to distributions within the pure GPD family. The possibility that results derived in this specific setting may be extended to the more general case is left for later consideration.

Throughout, there is an emphasis on results that are valid for small sample sizes. 



3 Elemental Estimators 

The main result here is the proof in Appendix 1 that each elemental estimator is absolutely unbiased within the GPD family. That the proof is valid, remarkably, for ALL $\xi$ may be appreciated by inspection of Eqn. (7) there. For $\xi$ negative, the expectation of the log-spacing is expressed in terms of the tail probabilities $G_i$ and $G_j$ via a term $\log(G_j^\gamma - G_i^\gamma)$ with $\gamma = -\xi$. This trivially decomposes into a simple term $\gamma \log G_j$ and a complicated term $\log(1 - (G_i/G_j)^\gamma)$. The proof shows how the simple terms provide the expectation $\xi = -\gamma$, and how the elementals combine the complicated terms in such a manner that they cancel (obviating the need to evaluate them explicitly). For $\xi$ positive, the absolute lack of bias is maintained by an additional simple term $-\gamma \log G_i G_j$ which adds $2\gamma$ to the $-\gamma$ result for $\xi$ negative. This elegant correspondence between the results for $\xi$ positive and negative is absent in previous approaches to GPD tail estimation. As further demonstration of the absolute lack of bias of each elemental triplet, even at small sample sizes, the numerically-returned average values for each of the fifteen elemental estimators available for $N = 7$ are shown in Fig. 2 over a wide range of tail parameters ($-10 < \xi < 10$).



Figure 2: Performance of elemental estimators, $N = 7$. Averages over 50,000 samples for the 15 elemental estimators for $N = 7$ over a range of $\xi$. The fifteen lines are almost indistinguishable from the diagonal, indicating that each is indeed absolutely unbiased.
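A numerical experiment of this kind could be sketched as follows. This is not the author's code: the helper names, the seed handling and the use of unit scale and zero location are assumptions, and the elemental weights are taken from the description in Section 4.

```python
import math
import random

def gpd_sorted_sample(n, xi, rng):
    """n GPD draws (mu = 0, sigma = 1) in decreasing order, by inverse transform."""
    out = []
    for _ in range(n):
        g = 1.0 - rng.random()                      # exceedance probability in (0, 1]
        out.append(-math.log(g) if xi == 0.0 else (g ** (-xi) - 1.0) / xi)
    return sorted(out, reverse=True)

def elemental(x, i, j):
    """Elemental estimate for 1-based i, j (j >= i + 2); x sorted decreasing."""
    return ((j - 1) * math.log(x[i - 1] - x[j - 2])
            - (j - i - 1) * math.log(x[i - 1] - x[j - 1])
            - i * math.log(x[i] - x[j - 1]))

def elemental_means(n, xi, trials, seed=0):
    """Monte Carlo average of every elemental available for sample size n."""
    rng = random.Random(seed)
    pairs = [(i, j) for i in range(1, n - 1) for j in range(i + 2, n + 1)]
    totals = dict.fromkeys(pairs, 0.0)
    for _ in range(trials):
        x = gpd_sorted_sample(n, xi, rng)
        for i, j in pairs:
            totals[(i, j)] += elemental(x, i, j)
    return {p: t / trials for p, t in totals.items()}
```

Calling `elemental_means(7, xi, trials)` returns fifteen averages, one per elemental, each of which should settle near `xi` as `trials` grows. For strongly negative $\xi$ this naive sampler can produce tied values near the upper end-point in floating-point arithmetic, so a more careful implementation would compute the spacings directly from the $G$ values.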

4 Linear combinations 

Trivially, any unit-weight linear combination of elementals will also be unbiased. Whilst it will be of 
interest to form efficient combinations, no detailed analysis of efficiency or variance is undertaken here. 
Instead, the performance of a simple combination is reported. 

Linear combinations of elementals are most conveniently described via the upper triangular matrix M of dimension $N \times N$ of Table 1 containing all possible log-spacings. The general term is $M_{IJ} = \log(X_I - X_J)$ for $J \ge I + 1$ and zero otherwise. (Use of the $N \times N$ form with zero diagonal allows for easier indexing). Each element above the secondary diagonal ($J \ge I + 2$) may be uniquely identified with an elemental estimator, involving that log-spacing, that to its left and that below it in M. Corresponding weights of elemental estimator combinations may thus be stored in an $N \times N$ upper triangular matrix R. Weighting an elemental $\hat{\xi}_{IJ}$ by $r_{IJ}$ requires (from Eqn. (1)) that the three weights $r_{IJ} \times (J - 1,\ -(J - I - 1),\ -I)$ be given, respectively, to the corresponding left, upper-right and lower log-spacings in M. These weights are illustrated in the grid G shown in Table 2. The totals may then be collected in an $N \times N$ matrix A of log-spacing weights. The sum of the entrywise product of A and M is then the unbiased estimate $\hat{\xi}$.

In summary, the matrix M is the roadmap of all log-spacings and the grid G gives the set of weights to 
be used within each elemental. A linear combination of elementals is then defined by a unit-sum matrix R, 
and the corresponding log-spacing weights are collected in the zero-sum matrix A. 
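The bookkeeping from R to A can be sketched with exact rational arithmetic. The dictionary representation keyed by 1-based pairs $(I, J)$ and the function names are assumptions of this example; the weights $(J-1,\ -(J-I-1),\ -I)$ are those stated above.

```python
from fractions import Fraction

def spacing_weights(r, n):
    """Collect log-spacing weights A from elemental weights R.

    r maps 1-based pairs (I, J) with J >= I + 2 to elemental weights; the
    result maps pairs (I, J) with J >= I + 1 to the weight of log(X_I - X_J).
    """
    a = {(i, j): Fraction(0) for i in range(1, n) for j in range(i + 1, n + 1)}
    for (i, j), w in r.items():
        a[(i, j - 1)] += w * (j - 1)        # left log-spacing
        a[(i, j)] -= w * (j - i - 1)        # upper-right log-spacing
        a[(i + 1, j)] -= w * i              # lower log-spacing
    return a

def linearly_rising(n):
    """Unit-sum elemental weights proportional to N + 1 - J."""
    raw = {(i, j): Fraction(n + 1 - j)
           for i in range(1, n - 1) for j in range(i + 2, n + 1)}
    total = sum(raw.values())
    return {p: w / total for p, w in raw.items()}
```

For $N = 7$, `spacing_weights(linearly_rising(7), 7)` reproduces the closed form $6(2N - 3J + 2)/(N(N-1)(N-2))$ in every cell with $J \ge I + 1$, and the entries of A sum to zero while those of R sum to one.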



An example combination 

A natural choice of linear combination might give equal weight to each elemental. However here we 
give further consideration to a simple "linearly-rising" combination. Numerical experiments indicate that 
many simple choices of linear combination lead to good estimators, and further research may seek the 
optimal combination. There is thus nothing special about the "linearly-rising" combination considered 
here. As will be demonstrated, it has a good all-round performance, but more importantly it illustrates 
the great simplicity that the elementals permit, allowing the ready creation of unbiased tail estimators with 
efficiencies comparable to current leading (and often highly complicated) alternatives. 

Table 1: The address matrix M for $N = 7$. Each elemental involves three adjacent cells in an inverted-L formation. That corresponding to the elemental $\hat{\xi}_{36}$ is shaded for illustration.

Table 2: The grid G of exponents of the elemental estimators, for $N = 7$. Again, the terms involved in the elemental $\hat{\xi}_{36}$ are shaded for illustration.

The "linearly-rising" combination has elemental weights $r_{IJ} \propto N + 1 - J$ and the resulting log-spacing weights are $a_{IJ} = 6(2N - 3J + 2)/(N(N - 1)(N - 2))$ for $J \ge I + 1$. For example, for $N = 7$ the weights are $a_{IJ} = (16 - 3J)/35$, that is $10/35,\ 7/35,\ 4/35,\ 1/35,\ -2/35,\ -5/35$ in columns $J = 2, \ldots, 7$.
Further illustration is given in Figure 3 (centre) for a sample size of 40, showing how the log-spacing weights have a simple linearly-rising distribution with zero mean. For comparison, the unusual pattern of the corresponding log-spacing weights of an unconstrained (i.e. optimised but biased) Segers estimator are also shown. Since the Segers estimator has only a single non-zero weight in each column it immediately follows that it cannot be constructed from the elemental triplets.
Figure 3: The A matrix of log-spacing weights for a typical elemental, the combined elementals and a Segers estimator.

In Figure 4 the errors of the elemental combination are compared with those of the unconstrained Segers estimator for pure GPDs with the somewhat extreme cases of $\xi = \pm 3$, and small sample sizes. The elemental combination has comparatively large variance around an unbiased mean, in contrast to the Segers estimator which is more tightly bunched around a biased offset. Despite the complexity, the extensive optimization and the substantial bias in the Segers estimator, its mean square error is nevertheless typically only marginally less than that of the simple elemental combination.

5 Completeness 

Proposition: if $\hat{\xi}$ is a linear combination of log-spacings and is an absolutely-unbiased, location- and scale-invariant estimator of the tail parameter of the GPD, then $\hat{\xi}$ is a linear combination of elementals.

The proof, presented in Appendix 2, shows that a requirement for lack of bias imposes $N - 1$ independent constraints on the $N(N - 1)/2$ dimensional space of possible linear combinations. The resulting subspace of unbiased estimators thus has dimension $(N - 1)(N - 2)/2$ and is that subspace spanned by the elementals. The elementals thus form a complete basis for unbiased, location- and scale-invariant log-spacing estimators of the GPD tail parameter.



6 Efficiency and Optimality 

Given that the efficiency of the estimator depends on the actual unknown value of the parameter ^, there is 
no unique definition of optimality. The question as to which linear combination is in some sense the 'best' 
is thus a matter of judgement. 

Figure 4: The mean, (mean ± std. dev.), and (actual + rmse) for the elemental combination (solid lines) and the unconstrained Segers (dashed) for 10,000 samples of sizes $N = 3$ to 20 drawn from GPDs with $\xi = 3$ and $\xi = -3$. Means are highlighted with + and actual-plus-rmse by o. Note the large bias in the Segers estimator (particularly for $\xi = -3$), and note the lack of bias in the elemental combination for samples as small as $N = 3$.

Figure 5: The relative efficiency of various linear combinations of elementals (for GPD samples of size $N = 20$ over a range of $\xi$). Efficiency is defined relative to the numerically-computed minimal variance at given $\xi$ within the class of location- and scale-invariant unbiased estimators that are linear combinations of log-spacings. The linearly-rising combination $r_{ij} \propto N + 1 - j$ is seen to be a good compromise, giving high relative efficiency over the whole range $-3 < \xi < 3$.

Fig. 5 shows the relative efficiency of various linear combinations of elementals, wherein relative efficiency is defined with respect to the minimal possible variance (given the tail parameter $\xi$) within the class of location- and scale-invariant unbiased estimators which are linear combinations of log-spacings of GPD data. Using the completeness of the elementals, at any $\xi$ and for any sample size $N$, the unbiased combination giving minimum variance within this class can be estimated numerically by constructing, via repeated samples, the numerical covariance matrix for the set of $(N - 1)(N - 2)/2$ elementals, and applying a Lagrange multiplier to enforce the unit sum condition on the coefficients $r_{ij}$. The Lagrange multiplier is then an estimate of the minimum variance. Since the computed coefficients are minimal for that set of samples, they will, for that sample set, perform better than the actual global optimum, and thus provide a lower bound on the minimum variance. The computed coefficients will not be fully optimal for other randomly drawn sample sets, and since the global optimum is, on average, optimal for other sets, then an upper bound on the minimum possible variance (within this class of estimators) can be obtained by applying the numerically-computed optimal coefficients to a large set of samples which were not used in their computation. By this procedure, using two separate blocks of 8000 samples of size $N = 20$ drawn from GPDs, the (approximate) optimal linear combination within the class was constructed for various $\xi$.
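The constrained minimisation just described can be written out as a short worked derivation. The symbols $C$ (the sample covariance matrix of the elementals), $r$ (the vector of elemental weights), $\mathbf{1}$ (a vector of ones) and the factor 2 attached to the multiplier $\lambda$ are notational assumptions of this sketch; with that scaling the multiplier coincides with the minimum variance.

```latex
% Minimise the variance r^T C r subject to the unit-sum condition 1^T r = 1.
\min_{r}\; r^{\mathsf T} C\, r
\quad \text{subject to} \quad \mathbf{1}^{\mathsf T} r = 1,
\qquad
\mathcal{L}(r,\lambda) = r^{\mathsf T} C\, r - 2\lambda\,(\mathbf{1}^{\mathsf T} r - 1).

% Stationarity in r, followed by the unit-sum condition:
C\, r = \lambda\, \mathbf{1}
\;\Longrightarrow\;
r = \lambda\, C^{-1}\mathbf{1},
\qquad
\lambda = \frac{1}{\mathbf{1}^{\mathsf T} C^{-1} \mathbf{1}}.

% Variance attained at the optimum:
r^{\mathsf T} C\, r = \lambda\, r^{\mathsf T} \mathbf{1} = \lambda .
```

Because $C$ is estimated from a finite block of samples, the resulting $\lambda$ is the in-sample minimum, which is why a second, independent block is needed to bound the variance from above.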

The optimal elemental coefficients $r_{ij}$ and corresponding log-spacing coefficients $a_{ij}$ computed for $\xi = 0$ are shown in Fig. 6. It can be seen that the coefficients are small near the $i \approx j$ diagonal, rising in amplitude near the top corner $i \approx 1$, $j \approx N - 2$. At all values of $\xi$ investigated, the optimal coefficients had this characteristic. Moreover all exhibited the decidedly non-smooth character reminiscent of the measure $\lambda$ in Segers's optimisation procedure (Segers (2005)).

Figure 6: The numerically computed matrix elements $r_{ij}$ and $a_{ij}$ that minimise the variance at $\xi = 0$.



Although $\xi$-specific optimal combinations have thus been computed, these are not optimal in any global sense, as their performance is far from optimal away from the values of $\xi$ at which they were optimised. This can be seen in Fig. 5 where the curve D1 shows the performance of the optimal $\xi = 0$ combination falling away rapidly from perfect relative efficiency for $\xi$ values different from zero.

Fig. 5 also shows the performance of some representative examples of other linear combinations. Curve A1 is for equal elemental weights ($r_{ij}$ = constant), showing good performance in the very heavy-tailed region, but with much lower efficiency for $\xi$ small or negative.

Curve B1 has $r_{ij} \propto 1/(i(i + 1))$, and thus gives much weight to the top row of the matrices, where log-spacings are measured from the data maximum. This combination is seen to give excellent performance in the $\xi$ negative region, but has low efficiency for $\xi$ positive.

Curve C1 (dashed) is for $r_{ij} \propto (j - i)^2$. This emulates the numerically-computed optimals in rising from zero near the diagonal to larger values in the $i \approx 1$, $j \approx N - 2$ top corner. The efficiency is good in the region near $\xi = 1$, but is low elsewhere, especially for $\xi \approx -1$. Interestingly, the efficiency stays close to that of the Segers estimator (+) throughout.

The relative efficiency of the Segers estimator is also shown in Fig. 5. The reason it can have a relative efficiency greater than unity (as it does near $\xi \approx 0.5$) is that it is not strictly within the class of estimators under consideration, in that it is biased and, moreover, is a nonlinear function of the log-spacings (since the weights are adaptively selected after an initial log-spacing-based estimate). It should also be noted that, despite having the advantages of bias-variance trade-off and access to a larger class of possibilities, its relative efficiency is nevertheless comparatively poor for $\xi$ negative.

Finally, it can be seen from Fig. 5 that the linearly rising combination with $r_{ij} \propto N - j + 1$ (shown with circles) has some plausible claim to being a suitable compromise. It has near optimal performance in the region $\xi$ from 0 to 3, which is often of great interest in practice, and although the efficiency falls somewhat for $\xi$ negative, it does not do so by much. For this reason, this combination will be considered further throughout the remainder of the paper.



6.1 Consistency - preliminary results 



Although there is as yet no proof of consistency for any elemental combination, numerical evidence suggests that the "linearly-rising" combination is consistent for samples drawn from within the GPD family. Fig. 7 shows how the Root Mean Square Error (RMSE) for the "linearly-rising" combination of elementals decreases as the sample size grows. At each point on each graph, the RMSE was determined numerically from 10,000 samples of size $N$ drawn from a GPD at various $\xi$ in the range $-3 < \xi < 3$, with $N$ increasing from 20 to 1000. At each $\xi$, and for $N$ large, the errors appear to be converging with increasing sample size $N$ at a rate proportional to $1/\sqrt{N}$, with the constant of proportionality dependent on $\xi$.
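A small-scale version of this experiment can be sketched as below, using the log-spacing weights $a_{IJ} = 6(2N - 3J + 2)/(N(N-1)(N-2))$ quoted in Section 4 for the linearly-rising combination. The function names, the seed, and the much smaller sample sizes and trial counts are assumptions chosen so that the example runs quickly in plain Python.

```python
import math
import random

def linearly_rising_estimate(x):
    """Linearly-rising combination; x sorted in decreasing order, len(x) >= 3."""
    n = len(x)
    scale = 6.0 / (n * (n - 1) * (n - 2))
    total = 0.0
    for j in range(2, n + 1):                    # 1-based column index J
        w = scale * (2 * n - 3 * j + 2)          # weight shared by every I < J
        total += w * sum(math.log(x[i - 1] - x[j - 1]) for i in range(1, j))
    return total

def rmse(n, xi, trials, seed=0):
    """Root mean square error of the estimate over repeated GPD samples."""
    rng = random.Random(seed)
    sq = 0.0
    for _ in range(trials):
        g = [1.0 - rng.random() for _ in range(n)]   # exceedance probabilities
        x = sorted((-math.log(v) if xi == 0.0 else (v ** (-xi) - 1.0) / xi
                    for v in g), reverse=True)
        sq += (linearly_rising_estimate(x) - xi) ** 2
    return math.sqrt(sq / trials)
```

Comparing, say, `rmse(8, 0.5, 1500)` with `rmse(32, 0.5, 1500)` gives a rough feel for how the error shrinks with sample size; reproducing the $1/\sqrt{N}$ trend up to $N = 1000$ would need a vectorised implementation.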

There is a consistency proof already in existence which has some relevance here, and covers many elemental combinations for the more general domain of attraction case (which trivially includes the pure GPD case). This is Theorem 3.1 of Segers (2005), which guarantees weak consistency of many elemental combinations in the $N \to \infty$, $k \to \infty$, $k/N \to 0$ limit. To be covered by this theorem, elemental combinations must be expressible as a mixture of Segers estimators satisfying a condition (Condition 2.5, Segers (2005)), which re-stated in the notation here, requires the log-spacing weight matrix A to have zero weights in the vicinity of the diagonal and of the top row. The linearly-rising combination does not satisfy this condition, although it is clear that minor adjustments can be made to zero the weights in the appropriate vicinities.

Finally, it could be noted that the presence (or lack) of asymptotic consistency results is arguably of 
limited interest if the emphasis, as here, is on providing estimators which perform well with small or 
moderately sized samples. 



Figure 7: Convergence, pure GPD, various $\xi$. Dependence of root mean square error RMSE on sample size $N$ ($20 \le N \le 1000$) for the linearly-rising combination, with samples drawn from GPDs with various tail parameters. Positive and negative $\xi$ are indicated with circles and crosses respectively. Horizontal axes are scaled to bring infinite $N$ to unity, by plotting $1 - \sqrt{2/N}$. This numerical evidence suggests that, for large $N$, the RMSE decreases in proportion to $1/\sqrt{N}$, with the constant of proportionality depending on $\xi$.



7 Summary 



'Elemental' absolutely-unbiased, location- and scale-invariant estimators for the tail parameter $\xi$ of the GPD have been presented, valid for all $\xi$ and all $N \ge 3$. The elemental estimators were shown to form a complete basis for those unbiased, location- and scale-invariant estimators of the GPD tail parameter which are constructed from linear combinations of log-spacings. Numerical evidence was presented which supports consistency of at least one elemental combination for samples drawn from the pure GPD family.
Appendix 1: Proof that the elemental estimator $\hat{\xi}_{IJ}$ is unbiased for the GPD

Consider a random variable $x$ with distribution function $F(x)$ and complement $G(x) = 1 - F(x)$, and via the probability integral transform, construct the inverse function $x = u(G)$ which maps an exceedance probability $G$ to a point $x$ in the data space. Since $dG = -(dF/dx)\,dx = -p(x)\,dx$ where $p(x)$ is the density, the expected value of any function $h(x)$ may be evaluated by transforming integrals over $x$ to integrals over $G$, using:

\[
\langle h(x) \rangle = \int_{\forall x} h(x)\, p(x)\, dx = \int_0^1 h(u(G))\, dG \tag{3}
\]

A consequence of the well-known uniform density of the $G$'s is that the second integral (over $G$) can be considerably simpler than the first integral (over $x$). For the GPD, integrals via $p(x)$ lead to lengthy expressions involving hypergeometric (Lauricella) functions. Although a proof that the elemental estimators are unbiased has been constructed by that route, the approach via the transformed integrals over $G$ is considerably simpler, and is presented here.

Consider an ordered sample $X$ of $n$ data points drawn from the distribution $F$, ordered such that $X_n < X_{n-1} < \ldots < X_2 < X_1$. The expected value of any function $h(X)$ is

\[
\langle h(X) \rangle = n! \int_0^1 dG_n \cdots \int_0^{G_2} dG_1 \; h(u(G)) \tag{4}
\]

the integral being over the $n$-dimensional unit simplex containing all possible $G$.

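The GPD has distribution function

\[
F(x) = 1 - \left(1 + \frac{\xi (x - \mu)}{\sigma}\right)^{-1/\xi} \tag{5}
\]

such that

\[
x = u(G) = \mu + \frac{\sigma}{\xi}\left(G^{-\xi} - 1\right) \tag{6}
\]

Depending whether $\xi$ is positive or negative, the expected value of the log-spacing between the $i$th and $j$th order statistics is

\[
\langle \log(X_i - X_j) \rangle =
\begin{cases}
\langle \log(G_j^\gamma - G_i^\gamma) \rangle - \gamma \langle \log G_i G_j \rangle + \log(\sigma/\gamma) & \text{for } \xi = \gamma,\ \gamma > 0 \\
\langle \log(G_j^\gamma - G_i^\gamma) \rangle + \log(\sigma/\gamma) & \text{for } \xi = -\gamma,\ \gamma > 0
\end{cases}
\tag{7}
\]

Consider an estimator $\hat{\xi}(X) = \sum_{i<j} a_{ij} \log(X_i - X_j)$, a linear combination of log-spacings. For $\hat{\xi}$ to be scale invariant, the weights $a_{ij}$ must sum to zero to remove the $\sigma$ dependence in Eqn. (7). Moreover, for $\hat{\xi}$ to be unbiased for both positive and negative $\xi$, Eqn. (7) requires

\[
\sum a_{ij} \langle \log(G_j^\gamma - G_i^\gamma) \rangle = -\gamma
\quad \text{and} \quad
-\gamma \sum a_{ij} \langle \log G_i G_j \rangle = 2\gamma \tag{8}
\]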

^fl,/log(G;-G;')> = -r and - ^ fl,-,r<logG,Gy) = 2r (8) 

To determine the expected value of any function $h(X_i, X_j)$, all other $G$ variables can be integrated out, leaving

\[
\langle h(X_i, X_j) \rangle = C_{ij} \int_0^1 dG_j \int_0^{G_j} dG_i \; G_i^{i-1} (1 - G_j)^{n-j} (G_j - G_i)^{j-i-1}\, h(u(G_i), u(G_j)) \tag{9}
\]

where $C_{ij} = n!/((i - 1)!\,(n - j)!\,(j - i - 1)!)$. For example

\[
\sum a_{ij} \langle \log G_j \rangle = \sum a_{ij} C_{ij} \int_0^1 G_j^{j-1} (1 - G_j)^{n-j} \log G_j \, dG_j \cdot \int_0^1 \phi^{i-1} (1 - \phi)^{j-i-1} \, d\phi \tag{10}
\]

where $\phi = G_i/G_j$. The $\phi$ integral leads immediately to the beta function $B(i, j - i)$. For the $G_j$ integral, standard Mellin transform theory gives

\[
\int_0^1 G_j^{j-1} (1 - G_j)^{n-j} \log G_j \, dG_j
= \left[ \frac{\partial}{\partial s} \int_0^1 G_j^{j-1+s} (1 - G_j)^{n-j} \, dG_j \right]_{s=0}
= B(j, n - j + 1)\,(\psi(j) - \psi(n + 1)) \tag{11}
\]

where $\psi$ is the digamma function, the derivative of the logarithm of the gamma function. Both beta functions have integer arguments and may be expressed as ratios of factorials. These cancel with the leading factorial terms $C_{ij}$, such that the expected value of the weighted sum is

\[
\sum a_{ij} \langle \log G_j \rangle = \sum a_{ij} (\psi(j) - \psi(n + 1)) = \sum a_{ij} \psi(j) \tag{12}
\]

the last step following from $\sum a_{ij} = 0$. It follows similarly that

\[
\sum a_{ij} \langle \log G_i \rangle = \sum a_{ij} (\psi(i) - \psi(n + 1)) = \sum a_{ij} \psi(i) \tag{13}
\]

For an individual elemental $\hat{\xi}_{IJ}$ the weights $a_{ij}$ and the relevant values of $i$ and $j$ are given in Table 3.

Table 3: The indices of an elemental estimator, and, in the final column, the weights.

    i        j        a_ij
    I        J - 1    J - 1
    I        J        -(J - I - 1)
    I + 1    J        -I
Using the relation $\psi(1 + x) = \psi(x) + 1/x$ we obtain the contributions from the individual elemental $\hat{\xi}_{IJ}$ to be

\[
\sum a_{ij} \psi(j) = (J - 1)\psi(J - 1) - (J - I - 1)\psi(J) - I\psi(J)
= (J - 1)\,(\psi(J - 1) - \psi(J)) = -1 \tag{14}
\]

\[
\sum a_{ij} \psi(i) = (J - 1)\psi(I) - (J - I - 1)\psi(I) - I\psi(I + 1)
= I\,(\psi(I) - \psi(I + 1)) = -1 \tag{15}
\]

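These are exactly what is required to show that any unit-sum linear combination $\hat{\xi}$ of elementals provides an unbiased estimate for $\xi$ via the $\langle \log G_j^\gamma \rangle$ and $\langle \log G_i^\gamma \rangle$ terms. Explicitly, with $\gamma = |\xi|$, we have

\[
\langle \hat{\xi} \rangle =
\begin{cases}
\gamma \sum a_{ij} \langle \log G_j \rangle + \sum a_{ij} \langle \log(1 - \phi^\gamma) \rangle & \text{for } \xi < 0 \\
-\gamma \sum a_{ij} \langle \log G_i \rangle + \sum a_{ij} \langle \log(1 - \phi^\gamma) \rangle & \text{for } \xi > 0
\end{cases}
\tag{16}
\]

\[
\langle \hat{\xi} \rangle = \xi + \sum a_{ij} \langle \log(1 - \phi^\gamma) \rangle \tag{17}
\]

It only remains to prove that the second term is zero. This term involves somewhat more complicated integrals leading to sums of digamma functions with non-integer arguments dependent on $\gamma$. Explicit evaluation of these can, however, be avoided here by observing how the elemental terms combine and cancel.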
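From Eqn. (9), we obtain

\[
\langle \log(1 - \phi_{ij}^\gamma) \rangle = C_{ij} D_j B_{ij} \tag{18}
\]

where

\[
D_j = \int_0^1 y^{j-1} (1 - y)^{n-j} \, dy
\]

and

\[
B_{ij} = \int_0^1 \phi^{i-1} (1 - \phi)^{j-i-1} \log(1 - \phi^\gamma) \, d\phi
\]

Summing over a single elemental using the weights $a_{ij}$ given in Table 3 and collecting the leading $a_{ij} C_{ij} D_j$ terms gives

\[
\sum a_{ij} \langle \log(1 - \phi_{ij}^\gamma) \rangle
= (J - 1)\, C_{I,J-1} D_{J-1} B_{I,J-1} - (J - I - 1)\, C_{IJ} D_J B_{IJ} - I\, C_{I+1,J} D_J B_{I+1,J} \tag{19}
\]

\[
= \frac{(J - 1)!}{(I - 1)!\,(J - I - 2)!} \int_0^1 \left\{ \phi^{I-1} (1 - \phi)^{J-I-2} - \phi^{I-1} (1 - \phi)^{J-I-1} - \phi^{I} (1 - \phi)^{J-I-2} \right\} \log(1 - \phi^\gamma) \, d\phi \tag{20}
\]

where the integrand contains the factor $[1 - (1 - \phi) - \phi] = 0$, thus

\[
\sum a_{ij} \langle \log(1 - \phi^\gamma) \rangle = 0 \qquad \forall\, \xi \ne 0 \tag{21}
\]

This completes the proof for all nonzero $\xi$.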

The proof for $\xi = 0$ follows similarly, using $G(x) = \exp(-(x - \mu)/\sigma)$, whence, using Eqn. (9), we obtain

\[
\sum a_{ij} \langle \log(X_i - X_j) \rangle = \sum a_{ij} \langle \log \sigma \rangle + \sum a_{ij} \langle \log[-\log \phi_{ij}] \rangle \tag{22}
\]

The first term is trivially zero due to the zero sum of the elemental $a_{ij}$, and the second term is zero for the same reason that $\sum a_{ij} \langle \log(1 - \phi^\gamma) \rangle$ is zero, as shown above, in that the elementals combine terms in such a way as to eliminate the expectation of any function of the $\phi_{ij}$.

This completes the proof that for any elemental $\hat{\xi}_{IJ}$, the expectation $\langle \hat{\xi}_{IJ} \rangle = \xi$ for all $\xi$.






Appendix 2: Completeness 

Here we prove that for an estimator $\hat{\xi}$ of the tail parameter $\xi$ of the GPD, with the preconditions that $\hat{\xi}$ is i) a linear combination of log-spacings, ii) absolutely-unbiased for all $\xi$ and iii) location- and scale-invariant, then $\hat{\xi}$ may be expressed as a linear combination of the elementals. That is, the elementals form a complete basis for the set of invariant unbiased log-spacing estimators of the GPD tail parameter.

Precondition ii requires that $\hat{\xi}$ must be unbiased at both $\xi = \gamma$ and $\xi = -\gamma$ for any $\gamma > 0$. This symmetry is embodied in Eqn. (7), addition and subtraction of which (together with precondition iii and elementary integrations such as Eqns (10)-(12)) leads to the requirements on the log-spacing weights $a_{ij}$ that

\[
\gamma \sum a_{ij} \left( \langle \log G_j \rangle + \langle \log G_i \rangle \right)
= \gamma \sum a_{ij} \left[ \psi(j) + \psi(i) \right] = -2\gamma \tag{23}
\]

\[
\gamma \sum a_{ij} \left( \langle \log G_j \rangle - \langle \log G_i \rangle \right)
= \gamma \sum a_{ij} \left[ \psi(j) - \psi(i) \right] = -2 \sum a_{ij} \langle \log(1 - \phi_{ij}^\gamma) \rangle \tag{24}
\]

where $\psi$ is the digamma function and $\sum$ means sum over $i$ from 1 to $N - 1$ and $j$ from $i + 1$ to $N$.

We now prove that the preconditions imply that the right-hand side of Eqn (24) is zero. Eqn (24) may be written as

\[
c_1 \gamma = \sum a_{ij} \langle \log(1 - \phi_{ij}^\gamma) \rangle := I_1
\quad \text{where} \quad
c_1 = -\tfrac{1}{2} \sum a_{ij} \left[ \psi(j) - \psi(i) \right] \tag{25}
\]

Similar to Eqn (10), each term in the $I_1$ summation may be considered individually and all variables irrelevant to that term may be integrated out to give

\[
I_1 = \sum a_{ij} C_{ij} \int_0^1 G_j^{j-1} (1 - G_j)^{n-j} \, dG_j \cdot \int_0^1 \phi_{ij}^{i-1} (1 - \phi_{ij})^{j-i-1} \log(1 - \phi_{ij}^\gamma) \, d\phi_{ij} \tag{26}
\]

The $G_j$ integral gives a beta function $B(j, n - j + 1)$ which combines with the $C_{ij}$ term to give a factor $1/B(i, j - i)$. Since each expectation integral over $\phi_{ij}$ is definite, we can set each $\phi_{ij} = \phi$. Each product $\phi^{i-1} (1 - \phi)^{j-i-1}$ involves integer exponents and can thus be expressed as a polynomial of degree $j - 2$. Passing the summation through the integral sign, the various polynomials can be collected into a single polynomial $P_{n-2}(\phi) = \sum_{k=0}^{n-2} b_k \phi^k$ of degree $n - 2$. Passing the summation back out of the integral gives:

\[
I_1 = \sum_{k=0}^{n-2} b_k\, g_k(\gamma)
\quad \text{where} \quad
g_k(\gamma) := \int_0^1 \phi^k \log(1 - \phi^\gamma) \, d\phi \tag{27}
\]

The binomial expansion of each $(1 - \phi)^{j-i-1}$ factor in Eqn. (26) leads to the total polynomial

\[
P_{n-2}(\phi) = \sum_i \sum_j a_{ij} \frac{(j - 1)!}{(i - 1)!\,(j - i - 1)!} \sum_{q=0}^{j-i-1} \frac{(-1)^q (j - i - 1)!}{q!\,(j - i - 1 - q)!}\, \phi^{i-1+q} \tag{28}
\]

Collecting together equal powers of $\phi$ gives

\[
P_{n-2}(\phi) = \sum_k \phi^k \sum_i \sum_j \frac{(-1)^{k-i+1} (j - 1)!}{(i - 1)!\,(k - i + 1)!\,(j - k - 2)!}\, a_{ij} \tag{29}
\]

such that the polynomial coefficients $b_k$ are

\[
b_k = \sum_{i=1}^{k+1} \sum_{j=k+2}^{n} \frac{(-1)^{k-i+1} (j - 1)!}{(i - 1)!\,(k - i + 1)!\,(j - k - 2)!}\, a_{ij} \tag{30}
\]
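Having determined the polynomial coefficients, we consider the integrals $g_k(\gamma)$ of the various $\phi^k$ terms, as defined in Equation (27). The transformation $\phi^\gamma = (1 - p)$ leads to

\[
g_k(\gamma) = \frac{1}{\gamma} \int_0^1 (1 - p)^{\frac{k+1}{\gamma} - 1} \log p \, dp
= \frac{1}{\gamma} \left[ \frac{\partial}{\partial s} \int_0^1 p^{s-1} (1 - p)^{\frac{k+1}{\gamma} - 1} \, dp \right]_{s=1}
= \frac{1}{\gamma} B\!\left(1, \frac{k+1}{\gamma}\right) \left[ \psi(1) - \psi\!\left(1 + \frac{k+1}{\gamma}\right) \right]
= \frac{1}{k+1} \left[ \psi(1) - \psi\!\left(1 + \frac{k+1}{\gamma}\right) \right] \tag{31}
\]

We thus have

\[
c_1 \gamma = I_1 = \sum_{k=0}^{n-2} b_k\, g_k(\gamma) = \sum_{k=0}^{n-2} \frac{b_k}{k+1} \left[ \psi(1) - \psi\!\left(1 + \frac{k+1}{\gamma}\right) \right] \tag{32}
\]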



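If this is true at all $\gamma > 0$, then it must be true for $\gamma$ large ($\gamma = 1/\epsilon$, $\epsilon > 0$, $\epsilon$ small). There

\[
g_k(\gamma) = \frac{1}{k+1} \left[ \psi(1) - \psi(1 + (k+1)\epsilon) \right]
\]

Since the digamma function is well-behaved (i.e. infinitely differentiable) near $\psi(1)$, we may take the Taylor series $\psi(1 + \delta) = \psi(1) + \psi'(1)\delta + O(\delta^2)$ to obtain

\[
g_k(\gamma) = -\psi'(1)\,\epsilon + O(\epsilon^2) \tag{33}
\]

thus

\[
I_1 = c_1 \gamma = \frac{c_1}{\epsilon} = \sum_{k=0}^{n-2} b_k\, g_k(\gamma) = -\epsilon\, \psi'(1) \sum_{k=0}^{n-2} b_k + O(\epsilon^2) \tag{34}
\]

whence

\[
c_1 = -\epsilon^2\, \psi'(1) \sum_{k=0}^{n-2} b_k + O(\epsilon^3) \tag{35}
\]

Since $c_1$ is a constant with respect to $\gamma$ and since (nonzero) $\epsilon = 1/\gamma$ can be arbitrarily small, we thus infer from (35) that the preconditions imply that $c_1 = 0$.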

Finally, since the preconditions imply that $I_1 = c_1 \gamma = 0$ and $I_1 = \sum_{k=0}^{n-2} b_k\, g_k(\gamma)$, the independence of the $N - 1$ functions $g_k(\gamma)$ implies that

\[
b_k = 0, \qquad \forall k \in \{0, 1, \ldots, n - 2\} \tag{36}
\]

These are the $N - 1$ constraints needed to reduce the dimension of the problem down to that spanned by the elementals.

That the elementals are contained within this subspace can be readily checked by gathering, within each $b_k$ summation, the terms associated with each elemental crosshair of the grid G. Each $b_k = 0$ constraint corresponds to a weighted summation over a subrectangle of the A matrix as illustrated in Table 4.

Consider an element $a_{IJ}$ away from the rectangle boundaries (such as $a_{26}$ in Table 4). Consider the $b_k$ summation terms associated with the elemental $\hat{\xi}_{IJ}$, which thus involves the term $a_{IJ}$, the term $a_{I,J-1}$ to its left and the term $a_{I+1,J}$ below it. The $b_k$ summation weights given to each of these terms can be obtained from Eqn (30) as

\[
b_{k,(I,J)} = \frac{(-1)^{k-I+1}\,(J - 1)!}{(I - 1)!\,(k - I + 1)!\,(J - k - 2)!} \tag{37}
\]

\[
b_{k,(I,J-1)} = \frac{(-1)^{k-I+1}\,(J - 2)!}{(I - 1)!\,(k - I + 1)!\,(J - k - 3)!} \tag{38}
\]

\[
\text{and} \quad b_{k,(I+1,J)} = \frac{(-1)^{k-I}\,(J - 1)!}{I!\,(k - I)!\,(J - k - 2)!} \tag{39}
\]
            j=2    j=3    j=4    j=5    j=6    j=7
    i=1     a12    a13    a14   [a15    a16    a17]
    i=2            a23    a24   [a25    a26    a27]
    i=3                   a34   [a35    a36    a37]
    i=4                         [a45    a46    a47]
    i=5                                 a56    a57
    i=6                                        a67

Table 4: Each constraint $b_k = 0$ is a summation over a rectangle of the A matrix. For $N = 7$, the rectangle of terms for $k = 3$ is shown (bracketed).

The elemental $\hat{\xi}_{IJ}$ contributes in proportions $-(J - I - 1)$, $(J - 1)$ and $-I$ to $a_{IJ}$, $a_{I,J-1}$ and $a_{I+1,J}$ respectively. The weights in the $b_k$ summation are thus such as to eliminate the contribution from the elemental $\hat{\xi}_{IJ}$, since it readily follows from the above that

\[
-(J - I - 1)\, b_{k,(I,J)} + (J - 1)\, b_{k,(I,J-1)} + (-I)\, b_{k,(I+1,J)} \tag{40}
\]

\[
\propto -(J - I - 1) + (J - k - 2) + (k - I + 1) = 0 \tag{41}
\]

(The proof for elements near the boundaries of the $b_k$ rectangle is similar).

Since any elemental satisfies the constraints then so does any linear combination thereof, and since there are $(N - 1)(N - 2)/2$ elementals and they are independent, it follows that they form a complete basis for those estimators of the GPD tail index that satisfy the preconditions given.
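The claim that every elemental satisfies the $N - 1$ constraints can be checked exactly for a given sample size, including elementals on the boundaries of the rectangles. The sketch below evaluates $b_k$ as a double sum over $i \le k + 1$ and $j \ge k + 2$ with coefficient $(-1)^{k-i+1}(j-1)!/((i-1)!\,(k-i+1)!\,(j-k-2)!)$, which is the reading of Eqn (30) assumed here; the function names and the dictionary representation of the weights are illustrative.

```python
from math import factorial

def b_coeff(a, k, n):
    """Constraint coefficient b_k for log-spacing weights a[(i, j)] (1-based, i < j <= n)."""
    total = 0
    for i in range(1, k + 2):
        for j in range(k + 2, n + 1):
            sign = -1 if (k - i + 1) % 2 else 1
            coef = factorial(j - 1) // (
                factorial(i - 1) * factorial(k - i + 1) * factorial(j - k - 2))
            total += sign * coef * a.get((i, j), 0)
    return total

def elemental_weights(i, j):
    """Log-spacing weights of the elemental (I, J), with J >= I + 2."""
    return {(i, j - 1): j - 1, (i, j): -(j - i - 1), (i + 1, j): -i}
```

For $N = 7$, `b_coeff(elemental_weights(i, j), k, 7)` is zero for every elemental and every $k$ from 0 to 5, whereas a zero-sum combination that is not built from elementals, such as weight $+1$ on $\log(X_1 - X_2)$ and $-1$ on $\log(X_1 - X_3)$, violates the $k = 0$ constraint.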